Register user's favorite sites
Circulate sites and create metadata
Extract new words (keywords)
Display the result of clustering
Fig.
Example of registered sites



<site>
<url>http://www.ibm.co.jp/</url>
<name>IBM Japan Homepage</name>
</site>
<site>
<url>http://www.ibm.com/</url>
<name>IBM Corporation</name>
</site>
<site>
<url>http://www.jp.ibm.com/shop/</url>
<name>Japan IBM [shopping]</name>
</site>
<site>
<url>http://www.jp.ibm.com/developerworks/</url>
<name>developerWorks</name>
</site>
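
A minimal sketch of how a registered-site list of this shape could be read, using Python's standard `xml.etree.ElementTree`. The figure shows bare `<site>` entries with no enclosing element, so the `<sites>` root, the constant `REGISTERED_SITES`, and the function name `load_sites` are assumptions made for illustration and do not come from the source.

```python
import xml.etree.ElementTree as ET

# The figure lists <site> entries without an enclosing element, so a
# hypothetical <sites> root is added here to make the fragment parseable.
REGISTERED_SITES = """
<sites>
  <site>
    <url>http://www.ibm.co.jp/</url>
    <name>IBM Japan Homepage</name>
  </site>
  <site>
    <url>http://www.jp.ibm.com/developerworks/</url>
    <name>developerWorks</name>
  </site>
</sites>
"""


def load_sites(xml_text):
    """Return a list of (url, name) pairs, one per <site> entry."""
    root = ET.fromstring(xml_text)
    sites = []
    for site in root.findall("site"):
        # findtext returns None when the child is missing; fall back to "".
        url = (site.findtext("url") or "").strip()
        name = (site.findtext("name") or "").strip()
        sites.append((url, name))
    return sites


if __name__ == "__main__":
    for url, name in load_sites(REGISTERED_SITES):
        print(f"{name}: {url}")
```

The resulting list of URLs is what a later step would visit when circulating the sites.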



Input file (HTML etc.) 
Metadata create feature
Information element extract feature
Attribute extract feature
Morpheme analyze feature



Keyword extract feature 



Keyword clustering feature 



Input file (metadata) 



Fig. 4 
Example of Link



Expression in HTML file 



<a href="http://www.ibm.com/jp/software/data/udb/v7/seminar.html">DB2 UDB V7 free seminar open</a>



Extracted information elements 



<anchor>
// title of link
<DCtitle>DB2 UDB V7 free seminar open</DCtitle>
// URL
<url>http://www.ibm.com/jp/software/data/udb/v7/seminar.html</url>
// extracted keywords
<kwds>
// keywords // clustering of keywords
<kwd><word>DB</word><class>10</class></kwd>
<kwd><word>UDB</word><class>10</class></kwd>
<kwd><word>V</word><class>10</class></kwd>
<kwd><word>seminar</word><class>13</class></kwd>
<kwd><word>open</word><class>13</class></kwd>
<kwd><word>free</word><class>13</class></kwd>
</kwds>
</anchor>
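
The link example can be sketched in code as follows. This is an illustrative outline only, assuming Python's standard `html.parser` and `xml.etree.ElementTree`; the class `AnchorExtractor` and the functions `anchor_element` and `extract_anchors` are hypothetical names, and the keyword (`<kwds>`) part is left out because the figure does not show how keywords and their classes are derived.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class AnchorExtractor(HTMLParser):
    """Collect (href, link text) pairs from <a> elements."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace in the link text.
            title = " ".join("".join(self._text).split())
            self.links.append((self._href, title))
            self._href = None


def anchor_element(href, title):
    """Build an <anchor> element holding the link title and URL."""
    anchor = ET.Element("anchor")
    ET.SubElement(anchor, "DCtitle").text = title
    ET.SubElement(anchor, "url").text = href
    return anchor


def extract_anchors(html_text):
    """Return one <anchor> element per hyperlink found in html_text."""
    parser = AnchorExtractor()
    parser.feed(html_text)
    parser.close()
    return [anchor_element(href, title) for href, title in parser.links]


if __name__ == "__main__":
    sample = ('<a href="http://www.ibm.com/jp/software/data/udb/v7/'
              'seminar.html">DB2 UDB V7 free seminar open</a>')
    for element in extract_anchors(sample):
        print(ET.tostring(element, encoding="unicode"))
```

Links without an `href` attribute are skipped in this sketch, since they carry no URL to record.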



Fig. 5



Example of text block 



Expression in HTML file 



industry talk / e-column for this week



Extracted information elements 



<text>
// text part
<DCdescription>industry talk / e-column for this week</DCdescription>
<kwds>
<kwd><word>e</word><class>10</class></kwd>
<kwd><word>industry</word><class>10</class></kwd>
<kwd><word>column</word><class>11</class></kwd>
<kwd><word>talk</word><class>13</class></kwd>
<kwd><word>this week</word><class>11</class></kwd>
</kwds>
</text>
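
As a rough illustration of how the `<kwd>` entries of such an information element might be consumed, the sketch below groups keywords by their `<class>` value. It assumes Python's standard `xml.etree.ElementTree`; the constant `TEXT_ELEMENT` and the function `keywords_by_class` are hypothetical, the `//` annotation lines of the figure are omitted from the sample, and the class codes are treated as opaque labels because the figure does not define them.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Sample modeled on the text-block figure, reduced to three keywords.
TEXT_ELEMENT = """
<text>
  <DCdescription>industry talk / e-column for this week</DCdescription>
  <kwds>
    <kwd><word>column</word><class>11</class></kwd>
    <kwd><word>talk</word><class>13</class></kwd>
    <kwd><word>this week</word><class>11</class></kwd>
  </kwds>
</text>
"""


def keywords_by_class(xml_text):
    """Map each <class> code to the list of <word> values carrying it."""
    root = ET.fromstring(xml_text)
    groups = defaultdict(list)
    for kwd in root.iter("kwd"):
        word = (kwd.findtext("word") or "").strip()
        code = (kwd.findtext("class") or "").strip()
        if word:
            groups[code].append(word)
    return dict(groups)


if __name__ == "__main__":
    for code, words in keywords_by_class(TEXT_ELEMENT).items():
        print(code, words)
```

The same function applies to `<anchor>` elements, since it only looks at the nested `<kwd>` entries.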



Fig. 6 
New information extract and display feature
Keyword statistics feature



Keyword significance calculate feature



Clustering feature 



Clustering result display feature 
Metadata DB

Sites' topic supply capacity DB

User specified weighting DB
Cluster 1



Keyword set 






database, DB, version 




Information element set 






• Development tool, e-commerce, operating system, database, Lotus products, network-related

• With a set of relational tables stored in the JDBC compliant relational database management system (DB2, Oracle, etc.), XML access service Lightweight Extractor (XLE) extracts data from the database and translates the extracted data to XML document and assembles

• Database which supports e-business further evolves!

DB2 Universal Database V7 released
...





Cluster 2 



Keyword set 
version 



Information element set 

• 8 MB memory and latest Palm OS Japanese version installed 

New model of Network companion 'WorkPad c3' released
...



Cluster 3 



Keyword set 
DB 



Information element set 



• DB2 UDB V6.1 awarded '2000 Codie Award'



